## Introduction ##
**[AdaS]** is an adaptive optimizer for scheduling the learning rate when training Convolutional Neural Networks (CNNs).

- AdaS exhibits the rapid minimization characteristics that adaptive optimizers like [AdaM](https://arxiv.org/abs/1412.6980) are favoured for
- AdaS exhibits *generalization* (low testing loss) characteristics on par with SGD-based optimizers, improving on the poor *generalization* characteristics of adaptive optimizers
- AdaS introduces no additional computational overhead over adaptive optimizers (see [experimental results](#some-experimental-results))
- In addition to optimization, AdaS introduces new probing metrics for CNN layer evaluation ([quality metrics](#knowledge-gain-vs-mapping-condition---cnn-quality-metrics))

This repository contains a [PyTorch](https://pytorch.org/) implementation of the AdaS learning rate scheduler algorithm.


## Requirements ##
### Software/Hardware ###
We use `Python 3.7`.

### Computational Overhead ###
AdaS introduces no overhead over adaptive optimizers. For example, mSGD+StepLR, mSGD+AdaS, and AdaM all take 40 min/epoch to train ResNet34 on ImageNet using an RTX 2080Ti.

### Installation ###
There are two versions of the AdaS code contained in this repository.
1. a Python package version of the AdaS code, which can be `pip`-installed.
2. a static Python module (unpackaged), runnable as a script.

All source code can be found in [src/adas](src/adas).


### Usage ###
Moving forward, I will refer to console usage of this library. IDE usage is no different. Training options are split two ways:
1. all environment/infrastructure options (GPU usage, output paths, etc.) are specified using command-line arguments.
2. training-specific options (network, dataset, hyper-parameters, etc.) are specified using a **config.yaml** configuration file:

```yaml
###### Application Specific ######
dataset: 'CIFAR10'
network: 'VGG16'
optimizer: 'SGD'
scheduler: 'AdaS'


###### Suggested Tune ######
init_lr: 0.03
early_stop_threshold: 0.001
optimizer_kwargs:
  momentum: 0.9
  weight_decay: 5e-4
scheduler_kwargs:
  beta: 0.8

###### Suggested Default ######
n_trials: 5
max_epoch: 150
num_workers: 4
early_stop_patience: 10
mini_batch_size: 128
p: 1 # options: 1, 2.
loss: 'cross_entropy'
```
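
One thing to watch when reading a file like the one above: PyYAML follows YAML 1.1 and resolves a value such as `5e-4` (exponent with no decimal point) as a string, not a float. The sketch below shows one way a parsed config could be checked and normalized before use. It assumes the file has already been loaded into a plain dictionary (for example with `yaml.safe_load`). The helper names `normalize_config` and `_as_number` are hypothetical and not part of the AdaS code base, and the validation rules are assumptions drawn only from the example file; whether the repository's own loader handles this case is not stated here.

```python
from typing import Any, Dict

# Top-level keys copied from the example config.yaml above.
EXPECTED_KEYS = {
    "dataset", "network", "optimizer", "scheduler",
    "init_lr", "early_stop_threshold", "optimizer_kwargs", "scheduler_kwargs",
    "n_trials", "max_epoch", "num_workers", "early_stop_patience",
    "mini_batch_size", "p", "loss",
}


def _as_number(value: Any) -> Any:
    """Turn numeric-looking strings such as '5e-4' into floats; leave the rest alone."""
    if isinstance(value, str):
        try:
            return float(value)
        except ValueError:
            return value
    return value


def normalize_config(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: validate and normalize an already-parsed config."""
    missing = EXPECTED_KEYS - set(raw)
    if missing:
        raise KeyError(f"config is missing keys: {sorted(missing)}")

    config = dict(raw)  # shallow copy so the caller's dict is left untouched
    for key in ("init_lr", "early_stop_threshold"):
        config[key] = float(raw[key])
    for key in ("optimizer_kwargs", "scheduler_kwargs"):
        # An empty YAML mapping may parse as None, so fall back to {}.
        config[key] = {name: _as_number(val) for name, val in (raw[key] or {}).items()}
    # The comment in the example file lists 1 and 2 as the only options for p.
    if config["p"] not in (1, 2):
        raise ValueError(f"p must be 1 or 2, got {config['p']!r}")
    return config
```

Writing the value as `5.0e-4` in **config.yaml** avoids the string parse altogether; the coercion above is only a safety net for files that keep the shorter form.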